Website Reconstruction using the Web Infrastructure
Author
Abstract
Backup or preservation of websites is often not considered until after a catastrophic event has occurred. In the face of complete website loss, webmasters or concerned third parties may be able to recover some of their website from the Internet Archive. Other pages may also be salvaged from commercial search engine (SE) caches if caught in time. We introduce the concept of “lazy preservation”: digital preservation performed as a result of the normal operations of the Web infrastructure (search engines and caches). We will investigate how websites can be automatically reconstructed from the Web infrastructure (WI) by using a web-repository crawler. We propose to measure the effectiveness of various web-repository crawler strategies and to evaluate methods for injecting the generative functionality of websites (e.g., CGI programs and databases) into the WI. We also propose to develop methods for tracking resources as they move through the WI and to characterize SE caches using random sampling.
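As a rough illustration of the reconstruction idea, the sketch below merges copies of a site's resources found in several web repositories and keeps one copy per URL. It is not the crawler described in the abstract: the `Copy` record, the repository labels, the sample data, and the "most recently stored copy wins" rule are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Copy:
    """One stored copy of a resource in a web repository (hypothetical record)."""
    url: str
    repository: str   # illustrative labels such as "archive" or "se_cache"
    stored_on: date   # date on which the repository captured the resource


def reconstruct(copies):
    """Keep, for each URL, the most recently stored copy across all repositories.

    The recency rule is an assumption of this sketch, not a policy
    taken from the abstract.
    """
    best = {}
    for c in copies:
        current = best.get(c.url)
        if current is None or c.stored_on > current.stored_on:
            best[c.url] = c
    return best


# Hypothetical sample data: two repositories hold copies of the same site.
copies = [
    Copy("http://example.com/", "archive", date(2005, 1, 10)),
    Copy("http://example.com/", "se_cache", date(2005, 6, 2)),
    Copy("http://example.com/a.html", "archive", date(2004, 11, 30)),
]
recovered = reconstruct(copies)
```

Other selection rules (for example, preferring an archived copy over a cached one regardless of date) would only require changing the comparison inside `reconstruct`.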
Similar resources
Reconstructing Websites for the Lazy Webmaster
Backup or preservation of websites is often not considered until after a catastrophic event has occurred. In the face of complete website loss, “lazy” webmasters or concerned third parties may be able to recover some of their website from the Internet Archive. Other pages may also be salvaged from commercial search engine caches. We introduce the concept of “lazy preservation”: digital preservati...
Electronic Educational and Research Services as Infrastructure for the E-Government: Role of
Introduction: Websites serve as an initial step toward e-government adoption, which facilitates delivery of online services to customers. The present study investigated the role of university websites in rendering educational and research services, based on an e-government maturity model, in Iranian universities. Methods: This descriptive and cross-sectional study was conducted through...
Identification and Classification of Desirable Web-Based Services from the Perspective of Website Users of Iran’s Hospitals Based on Kano Model of Customer Satisfaction
Background and Aim: A hospital website is an appropriate system for exchanging information and connecting patients, hospitals and medical staff. The purpose of this study was to identify and classify desirable web-based services in websites of Iran's hospitals based on Kano’s Customer Satisfaction Model. Materials and Methods: This was a survey study. The statistical population of the study co...
Designing a System for Trend Analysis of Users in Website Surfing in Iran Using Data Mining and Text Mining Algorithms
Background and Aim: As web surfing has become part of the lifestyle of a large majority of people, and given the need for more accurate social and cultural policy making in this field, the authors set out to analyze how users view different websites in order to help politicians and practitioners. Methods: Design science research method is used in this research...
Georeferencing Semi-Structured Place-Based Web Resources Using Machine Learning
In recent years, content shared on the web has grown significantly. Much of this information is publicly available as semi-structured data, and a significant amount of it is related to place. Such information refers to a location on the Earth but does not contain explicit coordinates. In this research, we tried to georefer...